Group variable selection via a hierarchical lasso and its oracle property
Abstract
In many engineering and scientific applications, prediction variables are grouped, for example, in biological applications where assayed genes or proteins can be grouped by biological roles or biological pathways. Common statistical analysis methods such as ANOVA, factor analysis, and functional modeling with basis sets also exhibit natural variable groupings. Existing successful group variable selection methods have the limitation of selecting variables in an “all-in-all-out” fashion, i.e., when one variable in a group is selected, all other variables in the same group are also selected [1, 23, 25]. In many real problems, however, we may want to keep the flexibility of selecting variables within a group, such as in gene-set selection. In this paper, we develop a new group variable selection method that not only removes unimportant groups effectively, but also keeps the flexibility of selecting variables within a group. We also show that the new method offers the potential for achieving the theoretical “oracle” property [6, 7].
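The “all-in-all-out” behaviour described above can be illustrated with a minimal sketch. It is not the method proposed in the paper; it only contrasts the standard group-lasso shrinkage step with the ordinary lasso shrinkage step. The group names, coefficient values, and threshold `lam` are hypothetical and chosen purely for illustration.

```python
import math

def soft_threshold(values, lam):
    """Lasso shrinkage step: each coefficient is shrunk toward zero on its own."""
    return [math.copysign(max(abs(v) - lam, 0.0), v) for v in values]

def group_soft_threshold(values, lam):
    """Group-lasso shrinkage step: the whole group shares one scaling factor."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm <= lam:
        # The entire group is removed together.
        return [0.0 for _ in values]
    scale = 1.0 - lam / norm
    # The entire group is kept together; no member becomes exactly zero.
    return [scale * v for v in values]

# Hypothetical coefficients for two groups (illustrative numbers only).
groups = {"pathway_a": [3.0, 0.2, -0.1], "pathway_b": [0.3, -0.2, 0.1]}
lam = 0.5  # hypothetical threshold

for name, coefs in groups.items():
    print(name, "group lasso:", group_soft_threshold(coefs, lam))
    print(name, "lasso:      ", soft_threshold(coefs, lam))
```

In this sketch the group step either keeps every member of a group or drops them all, whereas the elementwise step can zero out weak members of an otherwise important group. The second behaviour is the within-group flexibility that the abstract says the new method is designed to retain while still removing unimportant groups.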
Similar resources
A group bridge approach for variable selection.
In multiple regression problems when covariates can be naturally grouped, it is important to carry out feature selection at the group and within-group individual variable levels simultaneously. The existing methods, including the lasso and group lasso, are designed for either variable selection or group selection, but not for both. We propose a group bridge approach that is capable of simultane...
Variable Selection and Estimation in High-dimensional Varying-coefficient Models.
Nonparametric varying coefficient models are useful for studying the time-dependent effects of variables. Many procedures have been developed for estimation and variable selection in such models. However, existing work has focused on the case when the number of variables is fixed or smaller than the sample size. In this paper, we consider the problem of variable selection and estimation in vary...
Variable Selection for Cox’s Proportional Hazards Model and Frailty Model
A class of variable selection procedures for parametric models via nonconcave penalized likelihood was proposed in Fan and Li (2001a). It has been shown there that the resulting procedures perform as well as if the subset of significant variables were known in advance. Such a property is called an oracle property. The proposed procedures were illustrated in the context of linear regression, rob...
Path consistent model selection in additive risk model via Lasso.
As a flexible alternative to the Cox model, the additive risk model assumes that the hazard function is the sum of the baseline hazard and a regression function of covariates. For right censored survival data when variable selection is needed along with model estimation, we propose a path consistent model selector using a modified Lasso approach, under the additive risk model assumption. We sho...
The Adaptive Lasso and Its Oracle Properties
The lasso is a popular technique for simultaneous estimation and variable selection. Lasso variable selection has been shown to be consistent under certain conditions. In this work we derive a necessary condition for the lasso variable selection to be consistent. Consequently, there exist certain scenarios where the lasso is inconsistent for variable selection. We then propose a new version of ...